NPRG036 - Data Formats

Exams

Exam dates are already in SIS.

See a sample test.

Lectures - Mondays 09:00 in S4

  1. 2023-10-02: Data formats introduction: Google Slides, YouTube (English), YouTube (Czech)
  2. 2023-10-09: Graph data formats - RDF, RDF Schema, Linked Data, Open World Assumption: Google Slides, YouTube (English), YouTube (Czech)
  3. 2023-10-16: Graph data formats - SPARQL: Google Slides, YouTube (English), YouTube (Czech)
  4. 2023-10-23: Graph data formats - Basic vocabularies, Wikidata: Google Slides, YouTube (English), YouTube (Czech)
  5. 2023-10-30: Graph data formats - Labeled property graph model, Cypher, RDF-star: Google Slides, YouTube (English), YouTube (Czech)
  6. 2023-11-06: Hierarchical data formats - XML, XML Schema: Google Slides, YouTube (English), YouTube (Czech)
  7. 2023-11-13: Hierarchical data formats - XPath, XSLT: Google Slides, YouTube (English), YouTube (Czech)
  8. 2023-11-20: Hierarchical data formats - JSON, JSON Schema, JSON-LD: Google Slides, YouTube (English), YouTube (Czech)
  9. 2023-11-27: Relational data formats - SQL dump, CSV, CSV on the Web: Google Slides, YouTube (English), YouTube (Czech)
  10. 2023-12-04: Formats for geodata by guest speaker Michal Med: PDF, YouTube
  11. 2023-12-11: Key-value, configuration formats - .properties, INI, TOML, YAML: Google Slides, YouTube (English), YouTube (Czech)
  12. 2023-12-18: Formats for text documents: Google Slides, YouTube (English), YouTube (Czech)
  13. 2024-01-08: Multimedia formats - images, video, audio, containers, print formats: Google Slides, YouTube (English), YouTube (Czech), Print formats on YouTube (Czech)

Tutorials SU2

This section links to the tutorials and their examples. There are three tutorial sessions per week. Tutorials are marked either (R) Recommended, where we go through what you need for the homework, or (O) Optional, which are shorter and can be practiced at home; come to an optional tutorial only if you need to consult something, such as the homework.

Schedule and slides

The slides contain assignments to be practiced during the tutorial. If you run into problems, ask during the tutorial.

  1. Week 1 (R): Conceptual Modeling
  2. Week 2 (R): RDF
  3. Week 3 (R): SPARQL
  4. Week 4 (O): Wikidata
  5. Week 5 (R): LPG & Cypher
  6. Week 6 (R): XML & XML Schema
  7. Week 7 (R): XPath & XSLT
  8. Week 8 (R): JSON, jq, JSON Schema, JSON-LD
  9. Week 9 (O): HW part 3 (hierarchical formats) consultations
  10. Week 10 (R): CSV, CSV on the Web
  11. Week 11 (O): Geodata - GeoJSON, WKT, CRS, QGIS
  12. Week 12 (O): Key-value formats - TOML, YAML
  13. Week 13: Holidays
  14. Week 14 (O): Multimedia formats, Formats for text documents

Homework

Homework is done in groups and has 4 parts. All 4 parts must be turned in using the SIS Study group roster module before their individual deadlines in order to proceed to the final exam. The tutor's comments on a homework solution must be addressed by the time the next part is turned in. Before turning in a homework part, double-check the assignment and the common errors, and make sure you satisfy all requirements.

The final deadline for fixing all HW feedback is 2024-01-10T20:00:00. There must be no errors in the HWs by then.

Homework part 1: Conceptual model

Assignment
See the homework 1 assignment.

Homework part 2: Graph models

Assignment
See the homework 2 assignment.

Homework part 3: Hierarchical models

Assignment
See the homework 3 assignment.

Homework part 4: Relational model

Assignment
See the homework 4 assignment.

Common troubles with group homework

Group member or leader not communicating or not doing their part
  1. Do not hesitate to contact me. I will contact the non-communicating member and ask for an explanation.
    • This may be due to illness, which can happen
    • If necessary, I will remove the member from the group
    • If necessary, I will appoint a new group leader
  2. A reduction in group size is not a reason to reduce the homework scope
    • Assignments are doable even single-handedly, but teamwork is part of the experience
  3. A non-communicating group member is not a reason for a deadline extension
    • Do your homework early, not a day before the deadline
    • Set internal team deadlines and check your groupmates’ solutions
    • It is unacceptable to say you missed a deadline because one teammate was responsible for a certain task and did not deliver
  4. Communicate!
    • If you are ill or otherwise unable to work, let your group know ASAP
    • If you are removed from a team, you will fail this course

Homework feedback

You will receive feedback on your homework from me via e-mail. The feedback may be one of the following kinds:

Everything is OK and you get a ✅ in SIS.
Minor issues
You get a ✅ in SIS. You need to fix them along with the next HW.
Regular issues
You do not get a ✅ in SIS until you fix them. You need to fix them along with the next HW to be able to continue. If you do not fix them with the next HW, you fail the course.
Major issues
You need to fix them ASAP and let me know when you do. These issues will prevent you from doing the next assignment correctly. If you do not fix them with the next HW at the latest, you fail the course.
Fatal issues
These typically result from not following the instructions in the HW assignments, or from completely missing parts. You need to fix them ASAP and let me know when you do. If this kind of issue appears a second time, you fail the course.
Missed deadline
If the deadline passes and your group has not turned in a solution, you fail the course, unless there is a serious reason, e.g. a medical one.